Margin Distribution and Learning
Authors
Abstract
Recent theoretical results have shown that improved bounds on the generalization error of classifiers can be obtained by explicitly taking the observed margin distribution of the training data into account. Currently, algorithms used in practice do not make use of the margin distribution and are driven by optimization with respect to the points that are closest to the hyperplane. This paper enhances earlier theoretical results and derives a practical, data-dependent complexity measure for learning. The new complexity measure is a function of the observed margin distribution of the data and can be used, as we show, as a model selection criterion. We then present the Margin Distribution Optimization (MDO) learning algorithm, which directly optimizes this complexity measure. Empirical evaluation of MDO demonstrates that it consistently outperforms SVM.
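The abstract does not quote the paper's exact complexity measure or the MDO objective, so the sketch below is only a generic illustration of the ingredients it refers to: the empirical margin distribution of a linear classifier, a hypothetical distribution-dependent score (here a hinge-style average at an assumed scale `gamma`), and its use for ranking candidate models.

```python
# Minimal sketch, assuming a linear classifier (w, b), data X of shape (n, d),
# and labels y in {-1, +1}. The scoring rule is a hypothetical stand-in, not
# the complexity measure derived in the paper.
import numpy as np

def margin_distribution(w, b, X, y):
    """Signed geometric margins y_i * (w . x_i + b) / ||w|| for every training point."""
    return y * (X @ w + b) / np.linalg.norm(w)

def margin_score(w, b, X, y, gamma=1.0):
    """Hypothetical margin-distribution score: mean hinge-style penalty at scale gamma.
    Smaller is better; unlike the minimum margin, every point contributes."""
    m = margin_distribution(w, b, X, y)
    return np.mean(np.maximum(0.0, 1.0 - m / gamma))

def select_model(candidates, X, y):
    """Model selection: pick the (w, b) pair whose margin distribution scores best."""
    return min(candidates, key=lambda wb: margin_score(wb[0], wb[1], X, y))
```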
Similar resources
Large Margin Distribution Learning
Support vector machines (SVMs) and Boosting are possibly the two most popular learning approaches during the past two decades. It is well known that the margin is a fundamental issue of SVMs, whereas recently the margin theory for Boosting has been defended, establishing a connection between these two mainstream approaches. The recent theoretical results disclosed that the margin distribution r...
Distribution-dependent sample complexity of large margin learning
We obtain a tight distribution-specific characterization of the sample complexity of large-margin classification with L2 regularization: We introduce the margin-adapted dimension, which is a simple function of the second order statistics of the data distribution, and show distribution-specific upper and lower bounds on the sample complexity, both governed by the margin-adapted dimension of the ...
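The snippet says the margin-adapted dimension is a simple function of the second-order statistics of the data but does not give its definition. The sketch below is therefore an assumption made for illustration: it computes the eigenvalues of the empirical second-moment matrix and returns the smallest k for which k·γ² dominates the tail eigenvalue sum; the paper's actual definition may differ.

```python
# Hedged, illustrative sketch of a margin-adapted-dimension-like quantity.
# The threshold rule used here is an assumption, not the paper's formula.
import numpy as np

def margin_adapted_dimension(X, gamma):
    """Illustrative dimension-like quantity at margin scale gamma for data X of shape (n, d)."""
    # Eigenvalues of the uncentered second-moment matrix, sorted largest first.
    second_moment = (X.T @ X) / X.shape[0]
    lam = np.sort(np.linalg.eigvalsh(second_moment))[::-1]
    # Smallest k such that k * gamma^2 covers the tail energy beyond the top k directions.
    for k in range(len(lam) + 1):
        if k * gamma ** 2 >= lam[k:].sum():
            return k
    return len(lam)
```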
Optimal Margin Distribution Machine
Support vector machine (SVM) has been one of the most popular learning algorithms, with the central idea of maximizing the minimum margin, i.e., the smallest distance from the instances to the classification boundary. Recent theoretical results, however, disclosed that maximizing the minimum margin does not necessarily lead to better generalization performances, and instead, the margin distribu...
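The exact ODM formulation is not reproduced in this snippet; as a minimal sketch under that caveat, the code below only contrasts the two quantities the snippet mentions: the minimum margin that classical SVMs maximize, and the first- and second-moment statistics of the whole margin distribution that distribution-based methods work with instead.

```python
# Minimal sketch, assuming a linear classifier (w, b), data X of shape (n, d),
# and labels y in {-1, +1}. This is not the ODM objective itself.
import numpy as np

def margins(w, b, X, y):
    """Signed geometric margins of all labelled points w.r.t. the hyperplane (w, b)."""
    return y * (X @ w + b) / np.linalg.norm(w)

def minimum_margin(w, b, X, y):
    """Distance of the closest instance to the boundary: what classical SVMs maximize."""
    return margins(w, b, X, y).min()

def margin_moments(w, b, X, y):
    """Mean and variance of the margin distribution: the statistics that
    margin-distribution methods optimize instead of the single worst case."""
    m = margins(w, b, X, y)
    return m.mean(), m.var()
```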
Exploiting diversity for optimizing margin distribution in ensemble learning
Margin distribution is acknowledged as an important factor for improving the generalization performance of classifiers. In this paper, we propose a novel ensemble learning algorithm named Double Rotation Margin Forest (DRMF), which aims to improve the margin distribution of the combined system over the training set. We utilise random rotation to produce diverse base classifiers, and optimize the...
Margin Distribution and Learning Algorithms
Recent theoretical results have shown that improved bounds on generalization error of classifiers can be obtained by explicitly taking the observed margin distribution of the training data into account. Currently, algorithms used in practice do not make use of the margin distribution and are driven by optimization with respect to the points that are closest to the hyperplane. This paper enhance...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2003